1.0 INTRODUCTION

At last year's NATO workshop “Massive Military Data Fusion”, held in Norway in 2002, it was agreed, among other things, that there is an increasing need for visualisation of data. This need grows as more data arrive from more sensors and from different sources.

In the intelligence context, the fusion of information is a multi-layer process. First, raw data are collected, filtered and brought to archives. The analysis of the data is the most time-consuming step, since it can be automated only to a limited extent. Thus, there is a need for a “human-in-the-loop”. Humans analyse large amounts of data best through their pattern recognition abilities, which powerful visualisation algorithms can exploit. In this paper, we present our approaches to visualisation at different levels.

First, we describe data modelling in the intelligence context, presenting the MEDAV archive as our data model and the stages at which visualisation is a powerful means. We then define our understanding of visualisation and go on to show our approaches to visualisation for different information retrieval tasks.


2.0  DATA  MODELLING 

One of the most important parts of the information fusion task is the data model. The purpose of data modelling is to develop an accurate model, or graphical representation, of the client's information needs and business processes. The data model acts as a framework for the development of the new or enhanced application. Since the amount of data to be analysed can be enormous, the data model, as the core of the system, must be efficient and well organised for the domain of diverse intelligence.

Data modelling can be defined as the design of the logical and physical structure of one or more databases to accommodate the information needs of the users in an organization for a defined set of applications. In the intelligence context the basic needs are

•  to  offer  possibilities  to  store  huge  amounts  of  data, 

• to allow access to all data without regard to the type of the original data, e.g. recording source, text or speech signal.

The amount of data is rising, since data come from more and more input channels. Channels in a typical application in the intelligence context may be speech from telephones, mobiles, HF and many other sources. Some signals contain speech only in parts. Furthermore, information is gathered from texts from different sources, e.g. internet, mails, newspapers. Another source of information is images, such as those from satellites. All these sorts of data have to be analysed according to a specific task or question. The way of analysing data may differ for each channel, but none of the channels may be neglected.

A good data model stores input from different channels in a uniform way, so that the data can be accessed for further analysis in a uniform manner. Usually, data would be stored according to different criteria, e.g. depending on the source of the data. Another criterion may be the type of pre-processing that classifies the data into different categories and attaches the gathered meta information to each datum.

For the analysis of data, there is a need for a human-in-the-loop, i.e. at some point, a human operator must check and evaluate the data or the automatically obtained analysis. The work of the human-in-the-loop can be assisted mainly in two ways: 1) pre-selection of data that are more relevant than other sets in the data, which can be performed automatically up to a certain level; 2) analysis of data, among others by means of visualisation. The main advantage of using visualisation at this step of processing is that humans tend to understand complex issues quickly when they are presented visually. Thus, the process of analysis is sped up by visualisation.

In  the  next  section  we  will  describe  our  approach  to  data  modelling  with  the  MEDAV  archive  in  more  detail. 


3.0  DATA  MODELLING  IN  THE  MEDAV  ARCHIVE 

For the past 20 years, MEDAV GmbH has been a developer of intelligence processing hardware and software. We have been engaged in studies and commercial projects for the German Federal Armed Forces and for other German and international government agencies. Our challenge has been to develop hardware and software systems that incorporate the different levels of information fusion within a single architecture.

The need, especially in the intelligence context, is to archive huge amounts of data from different sources and to find all kinds of information regardless of the format in which the data are stored. The goals are:

• Allow storage of large amounts of data coming from a variety of input sources.

•  Retrieve  information  within  all  data  types  in  the  same  manner,  regardless  of  the  type  of  source. 

The task to be solved by the archive is to search for information or documents in the archive and, for example, to print out the relevant documents. This task can be solved with our data modelling together with the MEDAV archive.

The main principle is that all data are stored together with attributes. These comprise standard attributes, such as the date and the size of the source data, as well as an unlimited number of additional attributes that are defined by the application and by the users. Possible attributes are the author of a document, the transcription (of sound files, for example), keywords or other comments on the content of a data set. New attributes can be added at any time when desired by the user of the archive. The data are compressed and can be encrypted. Access can be limited to special data and a special user group. Access to the archive can be logged.

The search can easily be performed within the attributes. When a datum is stored, it is also entered into a searchable index file. Furthermore, the data can be removed from the archive while the index remains searchable.
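
The storage principle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MEDAV implementation: the class names `Record` and `Archive`, the method names and the sample attribute values are all hypothetical, and compression, encryption, access control and logging are left out. It shows data stored with standard and user-defined attributes, and an index that stays searchable after the data themselves are removed.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Record:
    """Index entry: standard attributes plus free-form user attributes."""
    record_id: int
    stored_on: date
    size_bytes: int
    extra: dict = field(default_factory=dict)


class Archive:
    """Toy archive (hypothetical): payloads and index are kept separately."""

    def __init__(self):
        self._payloads = {}  # record_id -> raw bytes
        self._index = {}     # record_id -> Record (attributes only)
        self._next_id = 1

    def store(self, payload, stored_on, **extra):
        """Store a payload and enter its attributes into the index."""
        rid = self._next_id
        self._next_id += 1
        self._payloads[rid] = payload
        self._index[rid] = Record(rid, stored_on, len(payload), dict(extra))
        return rid

    def add_attribute(self, rid, name, value):
        """Attach a new user-defined attribute at any later time."""
        self._index[rid].extra[name] = value

    def remove_payload(self, rid):
        """Drop the data; the index entry is deliberately kept."""
        self._payloads.pop(rid, None)

    def has_payload(self, rid):
        return rid in self._payloads

    def search(self, **criteria):
        """Return ids whose user attributes match all given criteria."""
        return [rid for rid, rec in self._index.items()
                if all(rec.extra.get(k) == v for k, v in criteria.items())]


archive = Archive()
a = archive.store(b"speech samples", date(2003, 1, 15), source="HF")
b = archive.store(b"report text", date(2003, 1, 16), source="mail")
archive.add_attribute(a, "language", "en")
archive.remove_payload(a)
found = archive.search(source="HF", language="en")
```

In this sketch `found` still contains the first record although its payload has been removed, which mirrors the separation of index and data described above.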

Most  importantly,  the  front-end  towards  the  user  remains  the  same  for  different  tools  inside  the  archive. 
All  types  of  data  can  be  searched  in  the  same  way  regardless  of  their  origin.  No  knowledge  about  data 
modelling  detail  or  the  data  architecture  is  necessary  for  the  user. 

An overview of the data modelling in the MEDAV archive is given in Figure 1.


Figure 1: Data modelling in the MEDAV archive.


At the bottom left, the sources S are shown. Input is gathered from different sensors. These data pass a pre-processing step, in which they are classified (CL) using a target/mission data base TM. Data that are not of interest are discarded. The remaining data are passed to a feature extraction step, which obtains meta data that may be of interest for further analysis.

The data are labeled with additional information like recording time and spoken language. For each data file that is obtained that way, information is attached as meta data (MD) that may be used for a faster search than looking up the source data. These raw data are stored in an archive.

Each information source is kept separately, as HF, VUHF, MobR in the example in Figure 1. The raw data part of the archive contains the original data together with the additional meta data. For further analysis, a reference to the original data is always added in order to be able to trace the analysis back to the original data, for example for verification.

The next step in the archive is a selection of the data (SEL) with respect to the information retrieval task. Data selected according to a choice of criteria are put into the Coll part of the data base. Selection can be performed automatically for some tasks as well as manually by a human operator. In this step, information fusion can be performed both by selecting a number of data files and by reducing each data file itself, e.g. by cutting speech signals to the relevant parts or by summarizing text. Visualisation of data files is very helpful at this step. A request to the selection of raw data may be “show all data containing KEYWORD1 and KEYWORD2 and make statistics about the frequency of occurrence in a certain time frame”. Another application could be to make summaries of text files that contain certain keywords or were written by a certain author.
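
A request of this kind can be illustrated with a small Python sketch. The record layout, the sample texts and the function names `select` and `hits_per_day` are assumptions made for the example; they do not describe the query interface of the archive.

```python
from collections import Counter
from datetime import date

# Hypothetical raw data: (recording date, text or transcription)
records = [
    (date(2003, 3, 1), "convoy leaves the harbour at dawn"),
    (date(2003, 3, 1), "harbour closed, convoy delayed"),
    (date(2003, 3, 2), "weather report for the harbour"),
    (date(2003, 3, 5), "convoy reached the harbour"),
]


def select(records, keywords, start, end):
    """Keep records inside [start, end] whose text contains every keyword."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for day, text in records:
        words = set(text.lower().replace(",", " ").split())
        if start <= day <= end and all(k in words for k in wanted):
            hits.append((day, text))
    return hits


def hits_per_day(hits):
    """Frequency of occurrence per day within the selected time frame."""
    return Counter(day for day, _ in hits)


hits = select(records, ["convoy", "harbour"],
              date(2003, 3, 1), date(2003, 3, 4))
stats = hits_per_day(hits)
```

The selected records would go to the collection part of the data base, while the per-day counts are the kind of statistic that can be shown as a histogram.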

The result of the information fusion is a collection of relevant data and/or summaries and statistics of the raw data base. The evaluation is then completed by writing a report (Rep) to the organisation of the operator. The report contains the evaluation results from the collection part of the data base. Furthermore, the report contains links to the original data so that the completeness and correctness of the data can be checked and quality assurance is provided.

During the analysis of the raw data base with respect to a certain task, an aggregation takes place from raw data to collection and on to the report. This aggregation is both a selection of the important data and a reduction of the data to the features that are important for the chosen task.

Evaluation  of  the  data  happens  in  two  places:  first,  in  the  selection  step  and  further  on,  in  the  report. 
The  evaluation  can  be  performed  automatically  or  semi-automatically  or  manually  by  the  human  operator. 
An  interface  for  automatic  evaluation  is  provided  to  make  the  evaluation  easier  for  the  operator. 

Visualisation in this application can be used at two stages: in the process of selection and in the report. The task of the visualisation is to help the operator find important data more easily and to make the obtained results more visible, i.e. easier to understand at a glance.


4.0  VISUALISATION 

As we have seen in the previous section, visualisation is an important feature in the analysis of diverse intelligence. In this section, we present our approach to visualisation. First, we define our understanding of visualisation and how it can be classified. To this end, we present a visualisation model. The human operator and the reader of a report are the users of the output of visualisation. An important topic is the interaction between the human and the visualisation interface, as well as the best way to show contents visually.

The goal of visualisation is, generally speaking, to make large data sets more accessible and easier to understand in a short time. Different means can be used for visualisation: graphics that show statistics, for example in a histogram, but also compressed texts that are shorter than the original texts and therefore take less time to read. Additionally, important parts may be highlighted.


Classifying visualisation by means of output, we find three categories:

• Traditional static predefined displays

• Augmented reality displays (where interactive iconic and textual information is embedded in realistic life scenarios)

• Virtual reality displays (where the user is located within the image and becomes part of it)

The emphasis of this paper is on the first item; in addition, we will show some examples of the other types.

4.1  Visualisation  Model 

At last year's workshop, a visualisation model was developed that describes the visualisation process, see Figure 2. The user of the visualisation interface accesses the multimedia displays, which optimally cover the human senses, i.e. visually, aurally and even haptically, as appropriate for the application in question. The user interacts with the multimedia displays in several modalities, traditionally with keyboard and mouse, but also with voice and gestures according to the requirements of the application.


Figure  2:  The  Visualisation  Model. 


From the multimedia displays, the user interacts with the task level HMI (human machine interface) in the most natural way, since at this level, humans should not be burdened with extraneous cognitive tasks required to operate a computer system. A well-designed HMI should make the computer “invisible” to its users, in the words of Donald Norman [Norman, 1998]. Using the task level HMI, the user should be able to obtain the desired information in the desired type of display; the HMI controls the modalities of presentation by interaction tokens.


4.2  Human  Interaction 

The goal of visualisation is to provide easy and fast understanding of complex issues. Therefore, the HMI should be designed to correspond to natural human communication. The first step is to study the results of physiology and psychology. The grok box project (http://vader.mindtel.com/concepts.html), among others, deals with these aspects.

A visualisation interface should be understood and handled intuitively. It should be represented by well-known symbols. Ideally, the symbols in the iconography used have a relation to what they stand for, so that the visual representation becomes easier to understand. Furthermore, humans understand best if several sensory modalities are employed at the same time, such as visual, aural and tactile stimuli. This effect can be reinforced by using different colors, different sounds etc.

For optical representation, different shapes and colors should be used. Large or frequently occurring aspects should be represented by large symbols; important topics should be placed in the middle and/or be highlighted. Still, to keep the image intuitively understandable, we must take care not to overload it with too much information or an overly complex iconography.

The  visual  representation  depends  much  on  the  content  that  is  to  be  displayed.  There  is  no  best  set  of  symbols 
or  way  to  display  information.  Some  complex  data  may  need  more  interaction  with  the  user,  such  that  he  may 
want  to  change  the  view  on  the  presented  data  depending  on  his  interests. 

4.3  Our  Approach  to  Visualisation 

The overall principle of our approach to visualisation is to provide an intuitive means for fast understanding of complex issues and facts. The important parts of the presented information should catch the eye immediately.

During the design of a visualisation tool, it is important to specify the preconditions of both the domain of analysis and the knowledge and preferences of the user. Thus, the effort must be estimated, in particular the computing power, which might limit the range of possible visualisations. The preferences of the user regarding the type of output and the display of information must be determined. It must also be established whether any type of knowledge is needed as a prerequisite for visualisation, such as a language’s grammar or an organisation’s structure.

One need that we have found in our studies is for a generic tool that the user can adapt, to a certain degree, to his own purposes. On the other hand, the tool should be specific enough to meet the needs of the user and provide fast access. We provide different levels of visualisation, ranging from text output to virtual reality graphics. We use colors and highlighting to make visual access easier. Drill-down, i.e. interactive focusing on a special part of the image, is provided if desired. Depending on the application and prerequisites, we provide different modules for visualisation. In the following section, we give examples from our work.


5.0  VISUALISATION  EXAMPLES 

In studies with the German Federal Armed Forces, MEDAV has evaluated a variety of information visualisation tools. While it is true that “a picture tells a thousand words”, we find that users need to have different views of the same data. For example, a graphical display of a military command structure can be usefully complemented by a simple tabular display of other information. However, neither display by itself is adequate. As another example, a graphical content summary of a text document may be useful if the document is large, but for small documents (or parts of documents), the user may prefer textual summaries.

In this section, we demonstrate visualisation tools at different levels of specialization, representation and display from studies we have performed. The visualisation examples can be roughly classified as shown in Table 1. The first two examples show the visualisation of signals.

Visualisation at the signal level is shown in Figure 3. A basic approach is shown in Figure 4, a two-dimensional graph of communication structures. Figure 5 and Figure 6 show further solutions for the visualisation of flows. Figure 6 and Figure 7 present interactive visualisation HMIs, the first one with a complex choice of visualisation options. Figure 8 gives an example of a virtual reality application. Figure 9 and Figure 10 show applications based on the analysis of the contents of texts and tables using artificial intelligence techniques.


Table 1: Visualisation solutions

Figure      Title                                   Classification
Figure 3    Sonagram in 3D, Eye Diagram in 3D       Signal, 3D
Figure 4    Command Hierarchy                       Classical, clustering, static
Figure 5    Bandwidth of a communication network    Flow, static
Figure 6    Transmitter traffic in real sites       Flow, photo, interactive
Figure 7    Communication cube                      Cube, interactive
Figure 8    Virtual Reality Application             Virtual reality
Figure 9    Communication of employees              Text mining, interactive
Figure 10   Visual Summary                          Visual summary, interactive

5.1  Visualisation  of  Signals 

At the level of signals, there is already a use for visualisation. Both examples in Figure 3 show that colors and three-dimensional representation make signals easier to understand. The left side of Figure 3 shows a sonagram in 3D representation: frequency runs from left to right, the time axis from front to back, and intensity is shown by height and color. The right side of Figure 3 shows an eye diagram, where color and 3D representation make the quality of the signal easier to judge.


Figure 3: Sonagram in 3D, Eye Diagram in 3D.


5.2 Classical Visualisation

A very simple and easy-to-realise visualisation is to represent data in a diagram or histogram in 2D or 3D. Colors help to show correlations, and additional graphics may emphasize results. Figure 4 shows a representation of a command structure obtained by cluster analysis of communication frequency. The command hierarchy is illustrated in the vertical dimension, while the colours and shapes are easily interpreted by reference to the legend at top right (Führung Ebene 1 = leadership level 1, Bearbeiter = administrative assistant, Admin = administration). The y-axis shows written e-mails, the x-axis shows confirmations of (or very short replies to) these e-mails. From the type of e-mails and their answers, conclusions about the hierarchy of a company can be drawn.


Figure  4:  Command  Hierarchy. 


5.3  Communication  Flows 

Communication flows can be visualised by lines whose color or thickness depends on their value. In Figure 5, the bandwidth between different places is shown by the thickness of the lines. For intuitive understanding, the network is projected onto a map of Europe. The main network streams can be identified at first glance.


Figure 5: Bandwidth of a communication network.


Another  example  is  the  interactive  view  of  communication  streams  in  Figure  6.  Instead  of  the  map,  a  high 
resolution  3D  landscape  is  shown,  and  the  position  of  the  transmitters  is  projected  onto  the  site. 


Figure 6: Transmitter traffic in real sites.


The user may scroll the icons at the top of the graphic and choose a time and place of interest as well as options for the display of the transmitters. The color of the communication lines distinguishes between data and speech communication. The user has several options for interaction with this visualisation interface (top right of Figure 6).

• Text drill-down: the user can select a transmitter and get text information about this transmitter.

• Audio drill-down: the user can listen to the selected channel.

• Wave propagation: shows the wave propagation around a selected transmitter.

• Labeling of a transmitter: e.g. for establishing a hierarchy of transmitters.

• Inspection of transmitters: further properties of the transmitter can be studied.

• Message: observations can be recorded as text.

Another way of displaying a communication flow with many data is a communication cube as in Figure 7. For example, the communication flow over time can be visualised in this way. Two axes of the cube represent the sender and the receiver, the third axis represents the time window. The intensity is shown by the size of the dots in the cube. Periodically occurring communications can be seen at a glance. The user can rotate or move the cube in order to obtain a better view of his specific field of interest.
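
The data structure behind such a cube can be sketched as a counter over (sender, receiver, time window) cells. The sample log, the window length and the simple periodicity check below are assumptions made for illustration; in the display itself, periodic communication is recognised visually rather than by a test of this kind.

```python
from collections import Counter

# Hypothetical log: (sender, receiver, hour of transmission)
log = [
    ("A", "B", 0), ("A", "B", 6), ("A", "B", 12), ("A", "B", 18),
    ("C", "A", 3), ("C", "A", 3), ("B", "C", 7),
]


def build_cube(log, window_hours):
    """Count messages per (sender, receiver, time-window) cell."""
    cube = Counter()
    for sender, receiver, hour in log:
        cube[(sender, receiver, hour // window_hours)] += 1
    return cube


def active_windows(cube, sender, receiver):
    """Sorted time windows in which the pair communicated at all."""
    return sorted(w for (s, r, w), n in cube.items()
                  if s == sender and r == receiver and n > 0)


def is_periodic(windows):
    """True if at least three active windows are equally spaced."""
    if len(windows) < 3:
        return False
    gaps = {later - earlier for earlier, later in zip(windows, windows[1:])}
    return len(gaps) == 1


cube = build_cube(log, window_hours=6)
regular = is_periodic(active_windows(cube, "A", "B"))
```

Each cell count corresponds to the size of one dot in the cube; a row of equally spaced dots along the time axis is what the check above looks for.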


[Figure 7 labels: Activities of a receiver; Display model; Activities of a sender; unequal intensity]

Figure 7: Communication cube.


5.4 Augmented and Virtual Reality

In augmented reality, a live view is combined with additional information gathered from some type of analysis. For example, a photograph of a city is enriched by data classifying the buildings in the photograph. Interactive systems in augmented reality have special requirements concerning real-time processing in order to keep the calculation aligned with the ongoing reality.

In virtual reality, the real image is replaced by a virtual scene, with which the human user interacts. An example is shown in Figure 8. A special requirement is the fast and exact estimation of new positions, in this case of the position of the hands of the person, as well as an estimation of the view and the eye position of the user. In addition, displaying the virtual reality may be computationally complex, since the current position must be calculated and the display updated after each movement of the user.


Figure  8:  Virtual  Reality  Application. 


5.5  Data/Text  Mining 

In the examples presented so far, the input to visualisation consists of numbers and signals. The visualisation process presents the data as they are, in a format that can be understood easily and intuitively. Text mining, as a specialisation of data mining, takes texts as input and visualises the contents after analysing them with algorithms derived from artificial intelligence research.

The  data  mining  technique  is  employed  when  large  amounts  of  data  must  be  analysed.  After  processing  of  the 
data  and  a  classification  process,  correlations,  trends,  and  dependencies  can  be  found. 

Automatic clustering can lead to a categorisation of the input data into different classes, such as the circles in Figure 4, which show the group of employees to which a person most probably belongs. Another strategy is the visualisation of dependencies by means of a classification tree. A classification tree is also trained automatically and can draw conclusions after analysing a data base. For example, consider a data base with information about car drivers including their age, the size of their car and their frequency of accidents. Using this data base, a classification tree can be trained that estimates the probability of a person being involved in an accident.
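
As a minimal sketch of the idea, the following Python code trains a tree with a single split (a decision stump) on driver age only, choosing the threshold that minimises the weighted Gini impurity. The training data are invented for the example, car size is ignored for brevity, and a real classification tree would apply such splits recursively over all attributes.

```python
# Hypothetical training data: (driver age, was involved in an accident)
drivers = [(19, True), (22, True), (23, True), (25, False),
           (31, False), (45, False), (52, False), (60, False)]


def gini(labels):
    """Gini impurity of a list of boolean labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(samples):
    """Find the age threshold with the lowest weighted Gini impurity."""
    ages = sorted({age for age, _ in samples})
    best = None
    for low, high in zip(ages, ages[1:]):
        threshold = (low + high) / 2
        left = [y for age, y in samples if age < threshold]
        right = [y for age, y in samples if age >= threshold]
        score = (len(left) * gini(left)
                 + len(right) * gini(right)) / len(samples)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


def accident_probability(samples, threshold, age):
    """Share of accidents among training drivers on the same branch."""
    branch = [y for a, y in samples if (a < threshold) == (age < threshold)]
    return sum(branch) / len(branch)


threshold, impurity = best_split(drivers)
p_young = accident_probability(drivers, threshold, 20)
p_older = accident_probability(drivers, threshold, 40)
```

The learned threshold and the two branch probabilities are exactly what a tree diagram would display at its root node and leaves.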

Association  techniques  can  be  used  to  find  correlations  between  different  events.  As  an  example,  shopping 
lists  in  the  supermarket  often  show  strong  correlation  between  different  products.  As  a  result,  the  products  in  a 
supermarket  can  be  sorted  according  to  this  correlation.  This  process  can  be  automated. 
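
A small sketch of such an association analysis, using the common measures of support and confidence, is given below. The baskets and function names are invented for illustration and are not taken from the studies described here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"beer", "crisps"},
    {"bread", "milk"},
]


def pair_support(baskets):
    """Fraction of baskets that contain each unordered product pair."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Estimate P(consequent in basket | antecedent in basket)."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    hits = sum(consequent in b for b in with_antecedent)
    return hits / len(with_antecedent)


support = pair_support(baskets)
butter_implies_bread = confidence(baskets, "butter", "bread")
```

Pairs with high support and confidence are the candidates for being placed next to each other, and they can be drawn as thick lines in an association graph.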

Another feature is fuzzy search. Looking for a person whose name or its spelling is not known exactly becomes easier when fuzzy search is allowed. The suspected name is entered into the category first name or last name, and the persons in the data base whose names most resemble the entry are returned.
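
A simple form of fuzzy name search can be sketched with the `difflib` module of the Python standard library, which ranks candidates by string similarity. The name list, the function name `fuzzy_lookup` and the cutoff value are assumptions for this example; other similarity measures, such as phonetic codes, could be used instead.

```python
from difflib import get_close_matches

# Hypothetical content of the "last name" category
last_names = ["Meier", "Mayer", "Schmidt", "Schmitt", "Keller"]


def fuzzy_lookup(query, names, limit=3, cutoff=0.6):
    """Return up to `limit` names that most resemble the query."""
    by_lower = {name.lower(): name for name in names}
    matches = get_close_matches(query.lower(), list(by_lower),
                                n=limit, cutoff=cutoff)
    return [by_lower[m] for m in matches]


candidates = fuzzy_lookup("Schmid", last_names)
```

The candidates are returned best match first, so the operator can inspect the most likely spellings before widening the search.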

Text mining takes text or tables as input. Given a data base containing data on the internal communication of employees, different results can be found using text mining. Figure 9 shows the communication habits of two different users. The user on the left writes the majority of his communication in the morning, while the user on the right has a fairly stable rate during the day, with a peak towards the evening. In this way, people can be identified by their communication habits. Using other views, the hierarchy of the company can be found, because each employee writes preferably to his boss.
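
Communication profiles of the kind shown in Figure 9 can be derived from a message log as in the following sketch. The log entries, user names and helper functions are invented for illustration.

```python
from collections import Counter

# Hypothetical message log: (sender, hour of day from 0 to 23)
messages = [
    ("anna", 8), ("anna", 9), ("anna", 9), ("anna", 15),
    ("ben", 9), ("ben", 13), ("ben", 17), ("ben", 18), ("ben", 18),
]


def hourly_profile(messages, sender):
    """Share of the sender's messages written in each hour of the day."""
    counts = Counter(hour for s, hour in messages if s == sender)
    total = sum(counts.values())
    return [counts[h] / total if total else 0.0 for h in range(24)]


def morning_share(profile):
    """Share of messages written before noon."""
    return sum(profile[:12])


def text_histogram(profile, width=20):
    """Render the non-empty hours as a simple text bar chart."""
    lines = []
    for hour, share in enumerate(profile):
        if share > 0:
            lines.append(f"{hour:02d}h {'#' * round(share * width)}")
    return "\n".join(lines)


anna = hourly_profile(messages, "anna")
ben = hourly_profile(messages, "ben")
print(text_histogram(anna))
```

Comparing such profiles, for example by their morning share or their peak hour, gives a simple fingerprint of each user's habits.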


Figure  9:  Communication  of  employees. 


5.6  Visual  Summary 

Another  application  for  visualisation  is  to  generate  a  visual  summary  from  text  files  of  any  kind.  There  are 
two  possibilities: 

1) text tagging: from each text, frequent words and keywords are extracted and brought into relation. The output consists of the original text with highlighted keywords. In addition, the highlighted words contain cross-references to other keywords. Statistics of the text are produced.

2) visual summary: the most frequent words of a text or the associations between words are presented as an interactive graphic.

For the visual summary, keywords are extracted from texts: these keywords can be determined automatically as the most frequent words in a text. From the list of keywords, function words (the words that are frequent in any text) are removed, resulting in a list of words that are important to this particular content.

Next, associations between the keywords are estimated, i.e. the frequency with which words occur in each other's neighbourhood. When the keywords are drawn with lines between them indicating their association, interesting relations between persons and actions become visible, as in Figure 10.
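
The two steps, keyword extraction and association estimation, can be sketched as follows. The function word list, the window size and the sample text are assumptions chosen for the example, and a real system would use a much larger function word list per language.

```python
from collections import Counter

# Tiny illustrative list of function words (an assumption, not exhaustive)
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "was"}
PUNCTUATION = ".,;:!?\"'()"


def tokenize(text):
    """Lower-case the words and strip surrounding punctuation."""
    stripped = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in stripped if w]


def keywords(tokens, top_n):
    """Most frequent words after removing function words."""
    counts = Counter(t for t in tokens if t not in FUNCTION_WORDS)
    return counts.most_common(top_n)


def associations(tokens, keyword_set, window=3):
    """Count how often two distinct keywords occur within `window` tokens."""
    pairs = Counter()
    for i, left in enumerate(tokens):
        if left not in keyword_set:
            continue
        for right in tokens[i + 1:i + 1 + window]:
            if right in keyword_set and right != left:
                pairs[tuple(sorted((left, right)))] += 1
    return pairs


text = ("The general met the minister in Berlin. "
        "The minister praised the general. The general left Berlin.")
tokens = tokenize(text)
top = keywords(tokens, top_n=3)
links = associations(tokens, {word for word, _ in top})
```

In a drawing of this result, the keyword counts would determine the size of the dots and the pair counts the thickness of the connecting lines.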

The  image  shows  the  frequency  of  keywords  in  a  source  text  of  130,000  words.  The  large  dots  describe  the 
frequency  of  the  keywords.  The  thick  lines  show  the  associated  words.  With  this  type  of  graphic, 
the  dependencies  in  complex  texts  can  be  found  automatically.  On  the  right  side  more  detailed  information  on 
each  word  is  given,  e.g.  the  frequency  of  each  word,  the  most  associated  words,  links  to  other  occurrences  of 
the  same  word  etc. 


Figure 10: Visual Summary.


6.0 SUMMARY

A picture tells a thousand words. This is true if the picture is well developed and designed. For information fusion and the visualisation of large amounts of data, it is very important, and becoming ever more important, to have a good means of reducing the data in a sensible way so that the desired information is obtained quickly. Some ways of designing visual interfaces are intuitive and therefore easy to use. In order to present more complex facts, more of an iconic language may have to be introduced. Still, it seems quite difficult to learn a new iconography that is not obviously intuitive.

This paper showed the results of our work on visualisation, which we carried out in order to obtain the most intuitive visualisation tools for each of the many applications in our studies. As shown, there is a huge potential for the analysis of data bases. Another goal that will be realised in the future is to detect the similarity of data automatically. Using this technique, one can easily find, for example, all documents belonging to the same topic. For the future, it remains a challenging task for us to continue finding more, and more intuitive, visualisation interfaces for emerging applications.